In statistics and mathematics, linear least squares is an approach to fitting a mathematical or statistical model to data in cases where the idealized value provided by the model for any data point is expressed linearly in terms of the unknown parameters of the model. The resulting fitted model can be used to summarize the data, to predict unobserved values from the same system, and to understand the mechanisms that may underlie the system.

Mathematically, linear least squares is the problem of approximately solving an overdetermined system of linear equations, where the best approximation is defined as the one that minimizes the sum of squared differences between the data values and their corresponding modeled values. The approach is called "linear" least squares because the assumed function is linear in the parameters to be estimated. Linear least squares problems are convex and have a closed-form solution that is unique, provided that the number of data points used for fitting equals or exceeds the number of unknown parameters, except in special degenerate situations. In contrast, non-linear least squares problems generally must be solved by an iterative procedure, and the problems can be non-convex with multiple optima for the objective function. If prior distributions are available, then even an underdetermined system can be solved using the Bayesian MMSE estimator.

In statistics, linear least squares problems correspond to a particularly important type of statistical model called linear regression, which arises as a particular form of regression analysis. One basic form of such a model is an ordinary least squares model. The present article concentrates on the mathematical aspects of linear least squares problems; the formulation and interpretation of statistical regression models, and the statistical inferences related to them, are dealt with in the articles just mentioned. See outline of regression analysis for an outline of the topic.
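The closed-form solution mentioned above can be illustrated with a minimal Python sketch. It assumes the overdetermined system is written as a list of rows `A` and a list of data values `y`, and it solves the normal equations (A^T A) beta = A^T y by Gaussian elimination. The function name `solve_normal_equations` and the tolerance `1e-12` are hypothetical choices made for this illustration, not part of any library.

```python
def solve_normal_equations(A, y):
    """Return beta minimizing sum_i (y[i] - sum_j A[i][j] * beta[j])**2.

    Illustrative sketch only: forms A^T A explicitly, which is simple
    but numerically less robust than QR- or SVD-based solvers.
    """
    m, n = len(A), len(A[0])
    if m < n:
        raise ValueError("need at least as many data points as parameters")

    # Augmented n x (n+1) matrix [A^T A | A^T y].
    M = [
        [sum(A[i][r] * A[i][c] for i in range(m)) for c in range(n)]
        + [sum(A[i][r] * y[i] for i in range(m))]
        for r in range(n)
    ]

    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:  # crude absolute tolerance (assumption)
            raise ValueError("degenerate problem: columns of A are linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]

    # Back substitution.
    beta = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(M[r][c] * beta[c] for c in range(r + 1, n))
        beta[r] = (M[r][n] - known) / M[r][r]
    return beta


# Three equations, two unknowns (illustrative values, not from the text).
beta = solve_normal_equations([[1, 0], [0, 1], [1, 1]], [1, 2, 4])
print(beta)
```

The `ValueError` branches correspond to the two conditions stated in the text: too few data points, and the special degenerate situations in which the solution is not unique.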
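== Example ==
As a result of an experiment, four data points <math>(x_1, y_1),</math> <math>(x_2, y_2),</math> <math>(x_3, y_3),</math> and <math>(x_4, y_4)</math> were obtained (shown in red in the picture on the right). We hope to find a line <math>y = \beta_1 + \beta_2 x</math> that best fits these four points. In other words, we would like to find the numbers <math>\beta_1</math> and <math>\beta_2</math> that approximately solve the overdetermined linear system
: <math>\beta_1 + \beta_2 x_i = y_i, \qquad i = 1, 2, 3, 4,</math>
of four equations in two unknowns in some "best" sense.

The "error", at each point, between the curve fit and the data is the difference between the right- and left-hand sides of the equations above. The least squares approach to solving this problem is to make the sum of the squares of these errors as small as possible; that is, to find the minimum of the function
: <math>S(\beta_1, \beta_2) = \sum_{i=1}^{4} \left[ y_i - (\beta_1 + \beta_2 x_i) \right]^2.</math>

The minimum is determined by calculating the partial derivatives of <math>S(\beta_1, \beta_2)</math> with respect to <math>\beta_1</math> and <math>\beta_2</math> and setting them to zero:
: <math>\frac{\partial S}{\partial \beta_1} = -2 \sum_{i=1}^{4} \left[ y_i - (\beta_1 + \beta_2 x_i) \right] = 0,</math>
: <math>\frac{\partial S}{\partial \beta_2} = -2 \sum_{i=1}^{4} x_i \left[ y_i - (\beta_1 + \beta_2 x_i) \right] = 0.</math>

This results in a system of two equations in two unknowns, called the normal equations, which give, when solved (provided the <math>x_i</math> are not all equal),
: <math>\hat\beta_2 = \frac{4 \sum_i x_i y_i - \sum_i x_i \sum_i y_i}{4 \sum_i x_i^2 - \left( \sum_i x_i \right)^2},</math>
: <math>\hat\beta_1 = \frac{1}{4} \left( \sum_i y_i - \hat\beta_2 \sum_i x_i \right),</math>
and the equation <math>y = \hat\beta_1 + \hat\beta_2 x</math> of the line of best fit. The residuals, that is, the discrepancies between the values from the experiment and the values calculated using the line of best fit, are then found to be <math>r_i = y_i - (\hat\beta_1 + \hat\beta_2 x_i)</math> for <math>i = 1, 2, 3, 4</math> (see the picture on the right). The minimum value of the sum of squares of the residuals is
: <math>S(\hat\beta_1, \hat\beta_2) = r_1^2 + r_2^2 + r_3^2 + r_4^2.</math>

More generally, one can have <math>n</math> regressors <math>x_1, \ldots, x_n</math> and a linear model
: <math>y = \beta_1 + \beta_2 x_1 + \cdots + \beta_{n+1} x_n.</math>

Source: Wikipedia, the free encyclopedia, article "Linear least squares (mathematics)".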
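The straight-line fit described in the example can be sketched in Python as follows. The four data points used here are illustrative values assumed for this sketch, and `fit_line` is a hypothetical helper name. The function solves the two normal equations for the intercept and slope in closed form, then computes the residuals and the minimized sum of squares.

```python
def fit_line(xs, ys):
    """Least squares fit of y = b1 + b2 * x; returns (b1, b2, residuals, s_min)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))

    # Determinant of the 2x2 normal-equation system; zero when all x are equal.
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("all x values are equal; the line is not determined")

    b2 = (n * sxy - sx * sy) / denom  # slope
    b1 = (sy - b2 * sx) / n           # intercept

    residuals = [y - (b1 + b2 * x) for x, y in zip(xs, ys)]
    s_min = sum(r * r for r in residuals)
    return b1, b2, residuals, s_min


# Illustrative data (assumed for this sketch).
xs = [1, 2, 3, 4]
ys = [6, 5, 7, 10]
b1, b2, residuals, s_min = fit_line(xs, ys)
print("intercept:", round(b1, 6), "slope:", round(b2, 6))
print("residuals:", [round(r, 6) for r in residuals])
print("sum of squared residuals:", round(s_min, 6))
```

For these assumed data the fit comes out at an intercept of about 3.5 and a slope of about 1.4, with a sum of squared residuals of about 4.2. The residuals sum to zero up to rounding, which is what the first normal equation requires.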